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In the last few decades, technological and experimental advancements have enabled a 
more precise understanding of the mode of action of drugs with respect to human cell 
signaling pathways and have positively influenced the design of new drug compounds. 
However, as the design of compounds has become increasingly target-specific, the overall 
effects of a drug on adjacent cellular signaling pathways remain difficult to predict because 
of the complexity of the interactions involved. Off-target effects of drugs are known to 
influence their efficacy and safety. Similarly, drugs which are more target-specific also 
suffer from lack of efficacy because their scope might be too limited in the context of 
cellular signaling. Even in situations where the signaling pathways targeted by a drug are 
known, the presence of point mutations in some of the components of the pathways 
can render a therapy ineffective in a considerable target subpopulation. Some of these 
issues can be addressed by predicting Minimal Intervention Sets (MIS) of elements of 
the signaling pathways that when perturbed give rise to a pre-defined cellular phenotype. 
These minimal gene perturbation sets can then be further used to screen a library of 
drug compounds in order to discover effective drug therapies. This manuscript describes 
algorithms that can be used to discover MIS in a gene regulatory network that can lead to a 
defined cellular phenotype. Algorithms are implemented in our Boolean modeling toolbox, 
GenYsis. The software binaries of GenYsis are available for download from http://www.vital- 
it.ch/software/genYsis/. 
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1. INTRODUCTION 

Advancements in high-throughput technologies have enabled 
biologists to measure the expression (or activity) of many genes 
simultaneously. By measuring and comparing the expression of 
genes in a normal vs. diseased cell phenotypes, biologists have 
been able to identif)' various key genes involved in disease path- 
ways. However, predicting a set of genes in a disease is often not 
sufficient as it may not be possible to directly manipulate the 
expression of those genes using available drug compounds. This 
necessitates studying the interactions of genes that are known 
drug targets with respect to the genes (or proteins) that are impli- 
cated in high-throughput analysis of specific disease phenotypes. 
This observation has shifted the focus of computational and 
experimental tools from studying individual genes to understand- 
ing the underlying gene regulatory networks (GRNs) of biological 
processes. By modeling GRNs, one would like to gain a deeper 
understanding of how different cellular phenotypes arise from the 
same set of underlying genes and how a biological system can be 
forced to differentiate into a specific phenotype by manipulating 
the expression of a small set of genes/proteins in the underlying 
GRN. 



Even with an improved understanding of the working of 
cellular signaling components and technological advancements 
in designing new drug compounds, it is often not possible to 
design drug compounds that specifically target only the desired 
genes/proteins with desired effectiveness. Some of these issues can 
also be addressed by predicting which elements of the GRNs have 
to be targeted in order to attain a pre-defined phenotype. The lat- 
ter problem is formally referred to as Minimal Intervention Sets 
(MIS) problem in the literature (Karlebach and Shamir, 2010; 
Samaga et al, 2010). In an MIS problem, one would like to 
enumerate a list of possible sets of genes/proteins from a given 
GRN, which when perturbed, can give rise to the desired cellular 
phenotypes. 

A GRN, such as the one in Figure lA, can be modeled as a 
Boolean network using the mapping we have introduced earlier 
in Garg et al. (2009). In the Boolean modeling of a GRN, a node 
can exist in two expression states: low and high. These two expres- 
sion states are represented by logic 0 and 1, respectively. A state of 
the network at a given time instant f is defined by the expression 
state of all the nodes at that instant of time. A state of the net- 
work evolves over time as defined by the underlying interactions 
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FIGURE 1 I (A) A toy GRN representing interactions between a set of 
genes/proteins. Arrow-headed edges represent activation and circle-lieaded 
edges represent inliibiting interactions. (B) MIS patters to push the 
systems into a steady state where 7" = 1 . For every MIS vector there is at 
least one mutations that has to be present. The polarity of these mandatory 
mutations are indicated by red (knock-out) or blue (over-express) colors. In 
addition, compatible off-target mutations and their polarities 
(over-expression or knock-out) are also listed for every MIS vector. Genes 
correponding to white color in an MIS vector indicate the mutations in 
these genes have no effect on the corresponding MIS vector. Genes 
correponding to gray color in an MIS vector should not be perturbed in any 
polarity (either by knock-out or over-expression). Else the corresponding 
MIS vector may not be able to generate the desired steady state. 



and stabilizes into a steady state or an attractor. An attractor is a 
set of states, such that once the network simulation reaches into 
one of the states of the attractor, it can only transition among the 
states within that attractor. An attractor with only a single state 
is referred to as a steady state. Here, we use the words attractors 
and steady states in the similar context of terminal states of the 
network simulations. Attractors of Boolean networks have been 
shown to correspond to the cellular phenotypes for various GRNs 
in the past (Kauffman, 1969; Mendoza and Alvarez-Buylla, 1998; 
Mendoza et al, 1999; Huang et al, 2005; Mendoza, 2005; Faure 
et al, 2006; Klamt et al, 2006; Davidich and Bornholdt, 2008). 
Also it has been widely shown how the impact of gene perturba- 
tions can be mapped to experimental outcomes in terms of the 
steady states. Introducing a gene perturbation in a GRN, wherein 
a node is either over-expressed (i.e., constantly high expression 
state) or knocked-down (i.e., constantly low expression state) can 
change the steady states of the GRN and even push the system 
from one steady state into another steady state. Modeling the 
system-wide behavior of a GRN in terms of transition from one 
steady state to another in response to small perturbations can be 
used to systematically discover combinations of perturbations or 
MIS that can be of therapeutical advantage. 

An MIS pattern gives the minimal possible combinations of 
gene perturbations required to force the GRN into a desired 
steady state and the term minimal in MIS implies that any other 
sub-set of an MIS pattern can not result in the same steady state 
behavior. However, more than one MIS patterns can generate the 



same steady state and for many practical applications, such as for 
screening a library of drug compounds, it would be necessary to 
enumerate a large number of such MIS patterns. An example of 
a list of MIS patterns that can force the GRN in Figure lA into 
a steady state where the node T is at a high expression level (i.e., 
r = 1) are listed in Figure IB. Although, most MIS patterns may 
seem trivial and involve regulating the direct upstream nodes of 
the node T, some non-trivial MIS patterns such as MIS 1 and 
MIS 2 may exist. The pattern MIS 1 states that the node Q should 
be knocked-out and node A should be over-expressed for forc- 
ing the GRN in Figure lA into a steady state where T = 1. Such 
non-trivial MIS patterns that require multiple simultaneous per- 
turbations are of utmost interest to algorithms developed in this 
manuscript. 

The MIS problem has been addressed relatively sparingly in 
the past. Two recently published manuscripts by (Karlebach and 
Shamir, 2010; Samaga et al., 2010) propose algorithms to com- 
pute such minimal sets of targets in a given GRN. However, the 
algorithms proposed in both the manuscripts do not scale well 
with the size of the signaling pathway when it is desirable to enu- 
merate MIS that would require more than three genes/proteins 
in the pathway to be simultaneously intervened. Moreover, these 
algorithms are not suitable for computationally screening drug 
compounds against pathways, as it is very rare to find drug 
compounds that would specifically target only a small list of 
genes/proteins highlighted in a given MIS. Therefore, in order 
to effectively screen drug compounds against pathways, it is nec- 
essary to also report compatible off-target genes/proteins along 
with every MIS of genes/proteins. Enumeration of compatible 
off-targets along with MIS has been completely disregarded in 
the previous works, rendering those algorithms of less practical 
significance. 

The MIS algorithms proposed in this manuscript also report a 
list of compatible off-targets along with every minimal interven- 
tion set. We demonstrate the advantages of reporting off-targets 
by applying our algorithms on a database of miRNAs and their 
corresponding gene targets in order to discover disregulated miR- 
NAs in cancer pathways. Our algorithm can also find many MIS, 
which requires taking into account the initial state of the pathway 
that corresponds to the resting state of the cell when the gene per- 
turbation should be applied experimentally. We demonstrate the 
importance of taking into account the initial expression state of 
the GRN by taking the example of T- Helper GRN. We show that 
many therapeutically interesting MIS patterns for pushing the T- 
Helper cells into Thl phenotype can be discovered only when the 
initial state of the T-helper GRN is set into the ThO steady state. 

2. METHODS 

Algorithms 1-4, presented in Figures 5, 6, describe the method- 
ology followed to find a list of MIS patterns, given a GRN G, a 
target node T (which summarizes the desired steady state), and 
Boolean variable outPolarity (which specifies the desired polar- 
ity of the target node, i.e., 0 or 1). The function comp_MIS() as 
described in Algorithm 1 is the core function of our MIS genera- 
tion methodology. The function comp_MIS() starts by unrolling 
the GRN into a tree-like structure starting from the target node 
T which has a fixed polarity (i.e., either high or low expression) 
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in the desired final steady state (Line 3 of Algorithm 1). These 
nodes with fixed polarities are referred to as the root nodes of the 
network. 

Algorithm 2 describes the GRN unrolling function 
UnrollGRNQ. The function UnrollGRNQ is called recur- 
sively starting from the root node (or target node) T. The GRN 
is unrolled along a path until a duplicate node is found. At that 
instance, if all the input nodes to the duplicate node already exist 
on the unroUed path, then the duplicate node is assigned a new 
name (with the symbol "~" over the original node name) and 
the unrolling process is terminated along this path. Otherwise, 
the duplicate node is farther unrolled until the criteria for 
terminating the unrolling process is met. In each recursion, a set 
of nodes on the current path is maintained in the set of already 
unroUed nodes (labeled as aN). The current node being unrolled 
is added to set aN in Line 5 of Algorithm 2 and the input nodes 
(labeled as g) to the current node being unrolled (labeled as S„) 
are checked for duplicacy in Line 7. If the input node g does not 
exist in the set aN, then the UnrollGRNQ process is repeated for 
that input node. In Lines 11-13, we check if any of the input 
nodes g to the node g does not belong to the set of duplicate 
nodes aN, in which case the node g is further unrolled (Line 
15). Otherwise the unrolling process is terminated along the 
current path by assigning the node ID corresponding to initial 
state nodes. The orglndex vector maintains the index ID of all the 
nodes in the input GRN. The UnrollGRNO process returns a new 
node ID corresponding to unrolled node counterpart of original 
node S„. The new node IDs are generated by incrementing 
the counter corresponding to original set of nodes in the GRN 
in the vector indexCount (Line 18 of Algorithm 2). Vector 
indexCount maintains a count of the number of times a given 
node in the GRN is duplicated in the unrolling process, and 



helps ensuring that no two nodes have the same node ID in the 
unrolled GRN. The unrolled function with new unrolled node 
IDs corresponding to input nodes to node S,, and new index of 
S„ (i.e., outlndex) is generated in the function composeFuncQ 
(described in Lines 24-30 of Algorithm 2). The composeFuncQ 
duplicates the Boolean function describing the node S„ in the 
original GRN and renames the input and output node with 
new node IDs that are generated in the function UnrollGRNQ. 
An offTargetlndex (in Line 20 of Algorithm 2) is a vector of 
the set of nodes that contains aU nodes that lie in between the 
current node being unrolled S„ and the root node T of the 
unrolled GRN. The set of nodes in the offTargetlndex are used 
for specifying compatible off-targets corresponding to each MIS 
in Algorithm 3. The unrolled network generated by applying the 
UnrollGRNQ function on the GRN in Figure lA is shown in 
Figure 2A. 

Once the input GRN is unrolled. Algorithm 1 generates the 
MIS vectors by recursively calling the function genMISQ starting 
from the unrolled index of target node T in Line 5 of Algorithm 1 . 
The genMISQ is described in detail in Algorithm 3. The MIS vec- 
tors are generated by scanning the unrolled network from the 
root node (i.e., target node T) toward the leaf nodes. The func- 
tion genMISQ is called recursively for the inputs of a given node 
(specified by currNode in Algorithm 3). The MIS vectors corre- 
sponding to input nodes are merged as described in the function 
mergeMISQ. The merged MIS vectors are combined with the new 
MIS vector which is generated for the given currNode using the 
function newMISQ. The mergeMISQ function as described in 
Algorithm 3, starts by setting Boolean variable flagOR to TRUE 
for Boolean input-output relationships that should be treated as 
an OR function (Lines 1 1-17). If the flagOR variable is TRUE, then 
the MIS vectors corresponding to input nodes to currNode are 
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merged using the MIS_OR function (as decribed in Algorithm 4). 
Otherwise the input MIS vectors are merged using the MIS_AND 
function. The newMISQ function, generates a new MIS vector for 
the currNode. It returns a Boolean vector of length six times the 
number of number of nodes in the original GRN. The Boolean 
vector can be divided into three fragments of equal number of 
bits. The first fragment stores the information corresponding to 
ojfTargetlndex nodes, the second fragment stores the information 
of nodes that should have a specific initial state (i.e., nodes with 
the symbol "~"), and the last fragment stores the information 
regarding the nodes that should be perturbed (i.e., over-expressed 
or knocked-out). Figure 2A shows the polarity of each node in 
the unrolled GRN as a result of applying the genMISO function 
on the unrolled network. 

The MIS_OR() and MIS_AND() are the core functions of our 
MIS algorithm and ensure minimal and non-conflict properties 
of generated MIS vectors. Functions MIS_OR() and MIS_AND() 
merge the list of MIS vectors in a Boolean OR or AND man- 
ner as described in Algorithm 4. The MIS_OR() function checks 
for setwise containment of an MIS vector from one input list 
in another input list (Lines 5 and 7) and maintains the mini- 
mal property of generated MIS vectors by dropping MIS patterns 
that are not minimal in Lines 6 and 8 of Algorithm 4. The 
MIS_AND() function, checks for conflict between the MIS vec- 
tors in two input lists (Lines 23-25). Before merging any two MIS 
vectors, it checks if the resulting MIS vector would give rise to 
a conflicting perturbations (i.e., the same node over-expressed 
in one input MIS and knocked-out in the other MIS). Any such 
MIS vectors are dropped from the merged list of two input MIS 
vectors. Small examples further demonstrating the MIS_AND() 
and MISJDRQ functions are described in Figures 2B,C, respec- 
tively. In Figure 2B, the MIS pattern at the input node Ei , which 
requires two perturbations involving over-expression of node B 
and knock-down of node Q is dropped from the merged MIS list 



at the output, as it is not a minimal pattern and can be replaced 
by the minimal MIS at the input node Bi , which only requires one 
perturbation involving over-expression of node B. 

Function genMISO in Line 5 of Algorithm 1 returns mini- 
mal perturbation sets obtained by traversing the unrolled GRN. 
However, presence of feedback loops may prevent some of these 
MIS patterns from generating the desired steady state in actual 
simulations. Therefore, in order to ensure that the generated MIS 
can indeed give rise to desired steady state, we simulate the gen- 
erated MIS patterns using the the function simulateMISQ, which 
essentially uses the Algorithm we developed in Garg et al. (2008) 
to model gene perturbations and checks if the resulting steady 
state are indeed the ones that are expected from the comp_MIS() 
function. The simulateMISQ function also checks for minimality 
of generated MIS patterns. These two additional steps, ensure that 
the list of MIS patterns resulting from comp_MIS() are indeed a 
minimal list of perturbed genes that can push the system into a 
desired cellular phenotype. 

3. RESULTS 

We show the application of our algorithms on T-Helper cell and 
Growth vs. Apoptosis gene regulatory networks. 

3.1. T-HELPER GRN 

We have previously demonstrated existence of three steady states 
in a Boolean model of the T-helper GRN [Figures 3A,B, (Garg 
et al., 2008)]. These three steady states correspond to molecu- 
lar profiles observed in ThO, Thl, and Th2 cells, which can be 
distinguished at the molecular level by their pattern of cytokine 
secretion that play a central role in cell mediated immunity (Thl 
cells) and humoral responses (Th2 cells). 

When we applied our MIS algorithm on the GRN of 
Figure 3A, it reported 22 MIS patterns that can drive the ThO 
steady state into Thl steady state. Out of these 22 MIS patterns, 8 
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FIGURE 3 I (A) T-Helper Gene Regulatory Network. Green arrow-headed 
edges represent activating interactions and red colored round-headed edges 
represent inhibiting edges. (B) Profile of three steady states present in 
T-Helper GRN representing ThO, Thl, and Th2 cellular phenotypes. (C) MIS 
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patterns indicating gene-perturbations in-order to generate Thl steady states. 
These MIS patterns have no pre-requirement on the initial state of the 
network. (D) MIS patterns to push the system into Thl phenotype when the 
system is initially in ThO steady state. 
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patterns (Figure 3C) do not have any pre-requirement on initial 
state of the network and can lead to Thl steady state indepen- 
dent of the initial state of the network. The remaining 14 MIS 
(Figure 3D) would lead to Thl steady state only if the network is 
initially in the ThO state. It is interesting to note that all the MIS 
patterns are composed of either a single or at most two simultae- 
nous perturbations. Most of the gene perturbation combinations 
among the 14 MIS patterns in Figure 3D are well-known gene 
perturbations for biasing the T-Helper cellular differentiation 
toward Thl cell types and have been used as target for well-known 
drug molecules to enhance immune response. The esdsting meth- 
ods to find MIS patterns, which do not take into account initial 
state of the network, will only be able to identify 8 MIS patterns 
listed in Figure 3C. All these 8 MIS patterns require either T- 
bet over-expression or GATA3 knock-down in combination with 
few other perturbations. However, both T-bet and GATA3 are 
known to be difficult to target using drug compounds and can 
only be supressed or activated by using siRNAs or through indi- 
rect stimulation of upstream cytokines (Weigmann and Neurath, 
2002; Usui et al, 2003; Liberman et al, 2009, 2012; Chou et al, 
2010). More interesting perturbations identified by the remaining 
14 MIS patterns in Figure 3D highlight the importance of consid- 
ering initial state of the GRN in the algorithms presented in this 
manuscript. 

The Thl cells produce IFNg as their signature cytokine secre- 
tion profile. Secreted IFNg can bind to its receptor IFNgR present 
on the Thl cell surface leading to activation of signaling pathways 
involved in Thl cell differentiation and maintenance of Thl state 
(NoveUi et al, 1997; Murphy and Reiner, 2002). The MIS pat- 
terns, MISInit 8 and 9, in Figure 3D represent over-expression of 
IFNg or IFNgR, respectively, that are required for differentiation 
of ThO cells to Thl phenotype. The MIS pattern MISInit 3 repre- 
sents activation of T cell receptor (TCR) through external ligands, 
which can lead to transcriptional expression of various cytokines 
involved in T cell differentiation. The remaining MIS patterns 
represent other well-known modes of differentiating ThO to Thl 
cell types either through production of IFNg in T helper cells or 
by activating proteins downstream of IFNg signaling (Mendoza, 
2005; Mendoza and Xenarios, 2006 ). These modes of activation 
include expression of external ligands such as IL-18, IL-12, and 
IFN-b {MISInit 5, 11 and 14), or by directly stimulating their 
target receptors IL-18R, IL-12R, and IFN-bR (MISInit 6, 1 and 
12), increased expression of intermediate Kinases IRAK and JAKl 
{MISInit 7 and 10), or by expressing transcription factors STATl, 
STAT4, or NFAT {MISInit 13, 2 and 4). The compatible off-target 
perturbations in the MIS patterns listed in Figure 3D, are mostly 
related to down-regulating the expression of ligands and other 
proteins involved in maintenance of Th2 cell type, and differen- 
tiation of ThO to Th2 cell types (Mendoza, 2005; Mendoza and 
Xenarios, 2006 ). 

3.2. GROWTH vs. APOPTOSIS GRN 

We identified key proteins that are known to play a crucial role in 
maintaining the balance between growth and apoptosis signals in 
the cancer pathways. We then constructed the GRN representing 
interactions between these proteins by identifying experimentally 
validated functions defining these interactions from the literature. 



Boolean formulation of these interactions are summarized in the 
GRN in Figure 4A and literature evidence supporting these inter- 
actions are listed in Supplementary Table 1. The GRN can be 
divided among four modules — PI3K, AKT, p53, and mTORCl — 
to represent four sub-networks that are mostly comprised of 
linear paths (or no feedback loops). These four modules interact 
with each other through positive and negative interactions giving 
rise to multiple feedback loops in the resulting GRN. As a result 
of these feedback loops, the GRN in Figure 4A gives rise to four 
Boolean attractors when simulated using our Boolean modeling 
toolbox genYsis (Garg et al., 2008). Both the attractors show 
oscillating apoptosis and growth signals demonstrating the abil- 
ity of the pathway to self-regulate the cellular growth. Figure 4B 
shows the average growth and apoptosis signals in these attrac- 
tors. An equal distribution of states displaying high growth and 
high apoptosis in these Boolean attractors highlight the balance 
maintained by the feedback loops in the constructed GRN. 

We modeled the impact of two well-known cancer mutations 
in genes p53 and TSC2 on the attractors of GRN in Figure 4A. 
The p53 gene is one of the most frequently mutated genes in 
human cancers (Bourdon, 2007; Vousden and Lane, 2007). The 
p53 mutation leads to constitutive inhibition of its functional- 
ity and suppresses apopotosis signals, leading to uncontrolled cell 
growth or proliferation. Our simulation results also capture this 
effect of p53 mutation, wherein the average apoptosis signals in 
the attractor states decreases to 0 and growth signals stay high as 
compared to attractors in wild-type GRN (Figure 4B). We also 
modeled the impact of mTORCl and PI3K inhibition (repre- 
senting the affect of drug compounds such as Rapamycin and 
Wortmannin, respectively) on restoring the balance of growth vs. 
apoptosis in the presence of p53 mutation (Figure 4B). From the 
simulation results in Figure 4B, it is clear that while PI3K inhi- 
bition can balance the p53 mutation by completely suppressing 
growth signals, mTORCl inhibition can only moderately decrease 
the constitutive growth signals resulting from p53 mutations. This 
can be attributed to PI3K inhibitor acting upstream of mTORCl 
in order to block all the three distinct pathways leading to cell 
division and protein production (Welsh et al, 1997; Mendez et al., 
2001; Liang et al., 2003). On the contrary, in the presence of muta- 
tion in TSC2 gene (leading to loss of function of TSC2 and high 
growth vs. apoptosis signals in attractor states), simulation results 
show that growth signals cannot be regulated with PI3K inhibitors 
(Figure 4B). However, mTORCl inhibitor can counter the effect 
of TSC2 mutation by restoring the growth signals to low level of 
wild-type steady state levels. It has been shown in the experiments 
that cells treated with PI3K inhibitors show improved apoptosis 
response in a TSC2 expression dependent manner (Kolb et al., 
2005). On the other hand, greatly elevated mTORCl signaling has 
been reported in the presence of TSC2 mutation, which can be 
restored with mTORCl inhibitors (Kim et al., 201 1). 

Having established a confidence in our constructed GRN for 
modeling the balance between growth and apoptosis signals in 
cancer pathways, we next computed MIS patterns that can force 
the dynamics of the network to constitutive high growth and con- 
stitutive low apoptosis signals. Our algorithm reported a total 
of 54 MIS patterns, none of which requires the system to be 
in a specific initial state (Figure 4C). Most of the perturbations 
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FIGURE 4 I (A) GRN representing interactions among some proteins 
known to play a crucial role in maintaining a balance between apoptosis 
and cellular growth in cancer signaling pathways. (B) Distribution of growth 
and apoptosis signals in the steady states of the wild-type GRN [labeled 
(wt)l, or in the presence of p53 or TSC knock-out mutations (labeled p53- 



and TSC-), and of the GRN where the knock-out affects of drugs targeting 
mTORCl (labeled mTORCI-) and PI3K (labeled PI3K-) are modeled. (C) MIS 
patterns indicating gene-perturbations necessary for in-order to push the 
system into the state correponding to constitutive high-growth and 
low-apoptosis signals. 



(over-expression or knock-out) defined by these MIS patterns are 
well-known mutations in cancer cells. Most of the MIS patterns 
listed in Figure 4C require two or more simultaneous perturba- 
tions, highlighting that highly malignant cell phenotypes often 
comprise of multiple mutations. For example, MIS patterns that 
contain either p53 or TSC mutations also require mutations in 
other parts of the pathway to push the system into desired steady 
state. This observation is also supported from the simulation 
results in Figure 4B, where these mutations alone cannot give rise 
to constitutively high growth and low apoptosis signals in steady 
states. Another perturbation vector given by MIS 19 represents 
the over- activated AKT protein, which may explain a well-known 
mutation in AKT that leads to hyper-phosphorylated form of 
AKT in cancer cells. 

We next screened these MIS patterns against a database of 
miRNA and their target gene pairs in human genome, which 
we compiled using four different miRNA target prediction algo- 
rithms [miRANDA (John et al, 2004), TargetScan (Lewis et al, 
2005), mirDB (Wang and El Naqa, 2008) and RNA22 (Miranda 
et al., 2006)] to predict miRNA- target pairs and combined these 
with experimentally validated pairs [TarBase, (Papadopoulos 
et al, 2009)]. In silico miRNA target prediction algorithms suf- 
fer high rates of type I and type II errors (Watanabe et al., 
2007; Zhang and Verbeek, 2010 ). Therefore, in order to increase 
the quality of the predictions we considered only those predic- 
tions that were supported by at least two different prediction 



algorithms, or were experimentally validated. We restricted the 
application of our algorithm to 464 miRNA cancer type pairs that 
were collected from different experimental studies in literature 
and have been shown to be disregulated (either up-regulated or 
down-regulated) in various different human cancers or cancer cell 
lines (Sinha et al, 2008). 

If a given miRNA targets a gene identified as over-expressed 
in an MIS pattern, then we define the polarity of dis-regulation 
of that miRNA as down-regulated. Similarly, a given miRNA is 
said to be up-regulated if its target gene is knocked-down in 
the MIS pattern. When disregulation of only a single miRNA 
is assumed to lead to cancerous behavior (i.e., only looking at 
MIS patterns with all knock-down genes or all over-expressed 
genes), our algorithm predicts 33 up-regulated and 20 down- 
regulated miRNAs. The predicted miRNAs, their polarity, cor- 
responding cancer types and supporting MIS patterns are listed 
in Supplementary Table 2. Out of these predicted miRNAs, 38 
miRNAs were found to have the same polarity as has been seen 
in published experimental results and 15 predicted miRNAs did 
not match the polarity observed in published experiments (Sinha 
et al., 2008). However, we believe that the small list of miRNA with 
mismatched polarity may arise from the fact that two or more 
miRNA may be simultaneously perturbed in the cancer types 
studied in published results and require analyzing the effect of 
disregulation in more than one miRNA along with our list of MIS 
patterns. 



Frontiers in Physiology | Systems Biology 



December 2013 | Volume 4 | Article 361 | 6 



Garg et al. 



Computing MIS in GRNs 



Algorithm 1: Algorithm for generating all 
JVlIS vectors 
1 

2 
3 
4 
6 
6 
7 

8 end 



comp-MIS(G, T, outPolarity) 
begin 

targetlndex = unrollGRN(G, T, uG) 
Yio\ax\tY\targetIndex\ — outPolarity 
MIS = genMIS(uG, polarity, targetlndex) 
MlSsim = simuIateMIS(MIS, G, T) 
return MISsim 



Algorithm 2: Algorithm for unrolling the 

GRN. 

1 Um-ollGRN(G(V', /), aN, S'„, uG) 
// G : Original GRN 

// aN : Already unrolled Nodes on current path 
// Sn : Current Node to unroll 



2 begin 



9 
10 



11 
12 
13 

14 
15 
16 
17 

18 
19 
20 
21 



for i = 1 to \V\ do 
|_ newlndex[i] = — 1 

// Add Sn to visited nodes on current path 

aN = aN U 5„ 

for g £ inp(5n ) do 

// Check if Node g is already unrolled 
if g ^ aN then 

I newlndex[g] = UnrollGRN(G,aW,g) 
else 

flagUnroH = 0 

// Check if all inputs to Node g are 

already unrolled 
for g e inp(g) do 
if g ^ aN then 

1^ flagUnroll = 1 

if flagUnroll — 1 then 

I newlndexis] = UnrollGRN(G, aN, g) 
else 

1^ newlndex[g] — orgIndex[(f| 

outlndex = indexCount[5'n] + 1 
composeFunc(G, Sn, newlndex[], outlndex, uG) 
offTargetIndex[o«t/n(iea;] = aN 

aN = aN\S„ 

// Return new node corresponding to Sn 
return outhidex 



22 

23 end 

24 composeFunc(G(y, f),Sn, newlndex[ ], outlndex, uG) 

25 begin 
26 
27 
28 
29 

30 end 



/new = duplicate_BooLfunc(/s^) 
renamelnputs(fnew, newlndex[]) 
renameOutput(fnew, outlndex) 

uG = uG U /new 



Algorithm 3: Algorithm for generating MIS 
vectors 

1 genMIS(uG, citrrNoiie, polarity) 

2 begin 



MISinp — mergeMIS(uG,c«rrJVo(fe, polarity) 
MIScurr — newMIS(uG, cwrrA/^Ofie, polarity) 
MISout = MIS.OR(MIScurr, MISinp) 
return MISout 



3 
4 

5 
6 

7 end 



8 mergeMIS(uG, curriVode, polarity) 

9 begin 
ID 
11 
12 

13 
14 



15 
16 



17 

18 
19 
20 

21 
22 
23 

24 
25 
26 

27 
28 
29 



flagOr = 0 

if polarityfc-urrTVof/e] = 1 then 

if BooLfunc(cMrrA'ode, uG) is of type OR 
then 

I flagOR = 1 
else 

// For output 1 treat as AND gate 

else if ];>Q>\&.v'\ty[currNode\ — 0 then 

if BooLfunc(c'urrNo(ie, uG) is of type AND 
then 

// For output 0 treat as OR gate 
flagOR = 1 

if flagOR = 1 then 

MISout = initMISVector(O) 
for g G inpNodes(BooLfunc(c«rr Af ode, uG)) 
do 

polarity[g] — polarityfcurrWocie] 
MlStmp = genMIS(uG,g, polarity) 
MISout = MIS_OR(MISout, MIStmp) 

else 

MISout = initMISVector(l) 
for g G inpNodes(Bool_func(«trr'Afo6ie, uG)) 
do 

polarity[g] = polarityfcurrTVode] 
MIStmp = genMIS(uG,g, polarity) 
MISout = MIS_AND(MISout, MIStmp) 

return MISout 



30 

31 end 

32 newMIS(uG,CMrrAf ode, polarity) 

33 begin 
34 
35 
36 
37 
38 
39 



40 
41 
42 
43 

44 
45 
46 
47 
48 
49 



50 
51 end 



currN odelndex — orgIndex[ci/rrA^ode] 
MIStmp = initMISVector(l) 
if currNode = initial^tate_node then 

I index — 4 
else 
1^ index = 2 

if polarityfcMrrWode] = 0 then 

I MIStmp[irtdex * currN odelndex] = 0 
else 

1^ MIStmp[in(fex * currN odelndex + 1] = 0 

for g G offTargetIndex[cMrrAfode] do 
nodelndex = orgIndex[g] 
if polarity[(j] = 0 then 

I MIStmp[6 * nodelnd^ex] = 0 
else 

|_ MIStmp[6 * nodelndex + 1] = 0 
return MIStmp 



FIGURE 5 I Algorithms for computing the MIS patterns given a Boolean GRN and the profile of desired steady states. 
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Algorithm 4: Algorithm for combining list 
of MIS vectors. 

1 MlS.OR{listA,listJi) 

2 begin 

3 for i = 1 to |Hst.l| do 

4 for i = 1 to \list.2\ do 

5 if 
genePert(H5t_l|i|) C genePert(list-2[j]) 
then 

1^ drop j from HstJ2 
else if 

genePert(/isi_2|j]) C genePert(/is4_l[i]) 
then 

drop i from listA 
break loop 



8 
9 
10 



11 
12 



MISnew = list A U /ist_2 
return MISnew 



13 end 

14 MIS JiND{listA,listJ2) 
IB begin 

16 
17 
18 
19 
20 
21 
22 
23 
24 
25 



26 
27 

28 
29 



30 



MISnew = III 
for i = 1 to |H«i-l| do 

for j = 1 to \list-2\ do 
isZero = 0 

for fc = 1 to 2 * nNodes do 



P 



: /i«t_l|i|[2 * fc] • Hst_2|il[2 * fc] 



= listA[i\\2*k+l]-list.2[j]{2*k+l] 
if p1 + pi = 0 then 
isZero = 1 
break loop 



:p- 



MIStmp\2 * k] -- 
MIStmp[2 * fc 4 

if isZero = 1 then 
MISnew = 

MlS.OR{M I Snew, MIStmp) 



return MISnew 



31 end 



FIGURE 6 I Algorithms for merging MIS patterns. 



4. DISCUSSION 

In this manuscript, we have presented an efficient approach to 
generate a list of minimal sets of gene perturbations that can 
push the dynamics of GRN into a specific steady state (represent- 
ing a given cellular phenotype). Our algorithm for computing 
MIS patterns follows a branch and bound approach, where the 
unrolled network is scanned for MIS patterns in a depth-first 
manner. If the network being simulated has many feedback loops, 
then the current implementation of Algorithms proposed in this 
manuscript may be inefficient due to the large size of the unrolled 
network (Supplementary Table 3). However, the unrolled net- 
work can be scanned for MIS patterns in an efficient manner 
by a parallel implementation of the function genMlSQ described 
in Algorithm 3. The efficiency of MIS computation can be fur- 
ther enhanced by parallelizing the function simulateMISQ, which 



simulates and tests the minimality of the MIS patterns generated 
by the function genMISQ. 

The algorithms proposed here can be useful in various exper- 
imental settings where one would like to enumerate multiple 
options to regulate the dynamics of GRNs. We demonstrate one 
such application of our algorithm with respect to T-Helper GRN. 
The T-helper GRN shown in Figure 3A has been previously 
shown to effectively model the precursor ThO cells and effector 
Thl andTh2 cells (Mendoza andXenarios, 2006 ). Understanding 
the molecular mechanisms that regulate the differentiation pro- 
cess from ThO toward either Thl or Th2 is very important, since 
an immune response biased toward the Thl phenotype result in 
the appearance of autoimmune diseases, and an enhanced Th2 
response can originate allergic reactions (Murphy and Reiner, 
2002; Agnello et al., 2003). We previously demonstrated how one 
can efficiently simulate the effect of gene perturbations in the T- 
Helper GRN (Garg et al, 2008, 2009). However, it would be of 
utmost interest to generate a list of possible perturbations that 
can transition a cell from one steady state to another. For exam- 
ple, MIS patterns required to transition a cell from ThO to Thl cell 
state would indicate different possible treatments to stimulate the 
auto-immune response of the body. The MIS patterns generated 
by our algorithm captures some of the well-known perturbations 
that have been experimentally validated to differentiate T-Helper 
cells into Thl phenotype. Gene expression levels can often exist at 
more than two expression states. The algorithms proposed here 
can be extended to multiple expression levels of nodes (such as 
Low/Medium/High) by encoding them into Boolean rules (Garg 
et al, 2007). 

The GRN presented here for modeling the growth vs. apop- 
tosis signals in cancer pathways, consists of various feedback 
loops that ensures the balance between the growth and apop- 
tosis pathways. It is a well-accepted fact that mutations leading 
to permanent loss or gain of function of genes in the feedback 
loops of signaling pathways can disregulate the delicate balance 
between pro-growth and pro-apoptopic cellular signals. If these 
mutations are in a favor of pro-growth or anti-apoptosis signals, 
then a cell is said to have a pre-disposition toward uncontrolled 
growth (and hence proliferation). In such a scenario, a cell (car- 
rying mutations) undergoes uncontrolled proliferation and can 
subsequently lead to the formation of tumors. Most cancer thera- 
pies either try to restore the normal expression of mutated genes 
directly or counteract the impact of mutated genes by targeting 
other genes (or proteins) in the pathway. Understanding how dif- 
ferent genes (and proteins) regulate each other in these pathways 
is therefore of major interest in the development of treatments for 
various cancers. Here, we show how the impact of such mutations 
can be studied with respect to our proposed GRN by modeling 
p53 and TSC mutations. In Figure 4B, differential response of 
therapies targeting PI3K and mTORCl nodes in the presence of 
p53 and TSC mutations indicate the importance of taking into 
account gene mutations when deciding upon the drug therapies. 
Whereas, only single mutations are simulated in Figure 4B, in a 
real world scenario, multiple mutations can be present simulta- 
neously. In such a scenario, activity of multiple genes/proteins 
may have to be targeted for the drug therapy to be effective. Such 
compatible sets of genes/proteins which would be suitable for 
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manipulation by one or more drug compounds can be quickly 
discovered by enumerating all MIS vectors of a GRN. Here, we 
demonstrate one such application by listing MIS vectors that can 
lead to high growth and low apoptosis signals in steady state of 
GRN of Figure 4A. 

We also demonstrate an example of how the MIS patterns gen- 
erated by our algorithm could be used to predict miRNAs which, 
when over or under-expressed, could lead to a cancerous pheno- 
type of cells. Aberrant expression of miRNAs is known to play 
a major role in the development of cancers due to their impor- 
tance in various biological processes such as cellular proliferation 
and apoptosis (Subramanian and Steer, 2010; Yu et al, 2010). 
However, despite much interest in this area, the mechanism of 
action of miRNAs in disease remains largely unknown. The cur- 
rent view is that miRNAs have evolved to coordinately regulate 
cellular processes; thus, whilst the number of miRNAs is rela- 
tively small compared to the number of genes [ < 1000 in humans, 
(Griffiths-Jones et al, 2008)], complex regulation mechanisms 
can be achieved through the combined actions of multiple miR- 
NAs acting in a temporal and spatial manner. In our analysis, of 
53 miRNAs predicted to influence the cancerous phenotype, 38 
(~71%) show the same polarity as published data, demonstrating 
the potential of this approach. It remains to be seen whether the 
remaining 15 predicted miRNAs that do not show the same polar- 
ity as published data, have an impact on defining the cancerous 
phenotype. 

In addition to applications shown in this manuscript, such as 
discovering minimal functional mutations in a disease phenotype 
and screening drug compound library, one can use the MIS 
vectors to guide experimental setups. Normally, in the absence of 
information of MIS patterns, one would have to try all possible 
combinations (3^) of N different perturbations (compounds or 
treatments) of genes/proteins in a given signaling pathway, which 
can quickly lead to a large number of experiments for even a 
modestly large GRN. Even when number of simultaneous per- 
turbations to test are restricted to two, it becomes unfeasible to 
perform all possible two-combinations of available compounds. 
In such a scenario, MIS patterns generated by our algorithm 
can be very useful. The number of MIS patterns that can lead 
to a desired steady state can be significantly smaller than the 
number of all possible perturbations as has been seen from the 
application of our algorithm on the T-Helper and Apoptosis 
vs. Growth GRNs (Figures 3, 4 and Supplementary Table 3). 
Generating such MIS patterns algorithmically using an approach 
presented here can therefore reduce both the number of 
simulations and experiments required, and can provide 
model-driven insight into which subsets of genes should be 
knocked-down and over-expressed to obtain a desired cellular 
phenotype. 
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